Overview

Dataset Statistics

Number of Variables 11
Number of Rows 50000
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 2237
Duplicate Rows (%) 4.5%
Total Size in Memory 21.8 MB
Average Row Size in Memory 457.5 B
Variable Types
  • Numerical: 4
  • Categorical: 6
  • Geography: 1
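
The duplicate-row figure appears twice in this report, as 4.5% here and as 4.47% under Dataset Insights; both are the same ratio at different rounding. The sketch below reproduces that arithmetic from the reported counts. The count_duplicates helper is hypothetical and assumes a duplicate is any row that repeats an earlier row, a definition this report does not state.

```python
from collections import Counter

def count_duplicates(rows):
    """Count rows that repeat an earlier row (assumed definition of 'duplicate')."""
    counts = Counter(tuple(r) for r in rows)
    return sum(c - 1 for c in counts.values())

# Counts copied from the Dataset Statistics table.
n_rows = 50000
n_dup = 2237
dup_pct = n_dup / n_rows * 100
print(round(dup_pct, 1), round(dup_pct, 2))

# Toy rows (illustrative only, not taken from the dataset).
toy = [("a", 1), ("b", 2), ("a", 1), ("a", 1)]
print(count_duplicates(toy))
```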

Dataset Insights

Area_Income is skewed [Skewed]
Dataset has 2237 (4.47%) duplicate rows [Duplicates]
Ad_Topic_Line has a high cardinality: 677 distinct values [High Cardinality]
City has a high cardinality: 638 distinct values [High Cardinality]
Country has a high cardinality: 223 distinct values [High Cardinality]
date has a high cardinality: 198 distinct values [High Cardinality]
Time has a high cardinality: 575 distinct values [High Cardinality]
Clicked_on_Ad has constant length 1 [Constant Length]
date has constant length 10 [Constant Length]
Time has constant length 5 [Constant Length]

Variables


Daily_Time_Spent_on_Site

numerical

Approximate Distinct Count 600
Approximate Unique (%) 1.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 800000
Mean 64.1332
Minimum 32.6
Maximum 91.37
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Daily_Time_Spent_on_Site is skewed left (γ1 = -0.2312)

Quantile Statistics

Minimum 32.6
5-th Percentile 40.04
Q1 49.84
Median 66.63
Q3 76.44
95-th Percentile 84.59
Maximum 91.37
Range 58.77
IQR 26.6

Descriptive Statistics

Mean 64.1332
Standard Deviation 14.8426
Variance 220.3024
Sum 3.2067e+06
Skewness -0.2312
Kurtosis -1.1527
Coefficient of Variation 0.2314
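
Several of the descriptive statistics are derived from one another: variance is the squared standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the sum is the mean times the row count. The sketch below checks those relations against the reported values for Daily_Time_Spent_on_Site. The skewness function is an illustrative population estimator (third central moment over the 1.5 power of the second); which estimator produced the report's γ1 is not stated, so treat it as an assumption.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (assumed estimator)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Values copied from the Daily_Time_Spent_on_Site tables.
n_rows = 50000
mean = 64.1332
std = 14.8426

variance = std ** 2      # close to the reported 220.3024 (std is itself rounded)
coef_var = std / mean    # reported as 0.2314
total = mean * n_rows    # reported as 3.2067e+06

print(round(variance, 2), round(coef_var, 4), f"{total:.4e}")

# A toy right-skewed sample (not from the dataset) gives a positive value.
print(skewness([1, 1, 1, 10]) > 0)
```

A negative γ1, as reported for this variable, indicates a longer tail on the low side; a positive value indicates a longer tail on the high side.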

Age

numerical

Approximate Distinct Count 39
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 800000
Mean 35.8354
Minimum 19
Maximum 60
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Age is skewed right (γ1 = 0.6215)

Quantile Statistics

Minimum 19
5-th Percentile 23
Q1 29
Median 34
Q3 41
95-th Percentile 53
Maximum 60
Range 41
IQR 12

Descriptive Statistics

Mean 35.8354
Standard Deviation 8.8651
Variance 78.5892
Sum 1.7918e+06
Skewness 0.6215
Kurtosis -0.1205
Coefficient of Variation 0.2474
  • Age has 846 outliers
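
The report does not say how the 846 outliers were identified. One common convention, assumed here rather than taken from the report, is the box-plot rule that flags values more than 1.5 × IQR beyond the quartiles. The sketch applies that rule to the reported Age quartiles; iqr_fences is a hypothetical helper name.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences for the k * IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Q1 and Q3 copied from the Age quantile table.
low, high = iqr_fences(29, 41)
print(low, high)  # 11.0 59.0
```

Under that assumption only ages above 59 would be flagged, since the reported minimum of 19 sits well above the lower fence and the reported maximum is 60.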

Area_Income

numerical

Approximate Distinct Count 655
Approximate Unique (%) 1.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 800000
Mean 53927.7264
Minimum 13996.5
Maximum 79332.33
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Area_Income is skewed left (γ1 = -0.4714)

Quantile Statistics

Minimum 13996.5
5-th Percentile 31947.65
Q1 47575.44
Median 55993.68
Q3 63100.13
95-th Percentile 71455.62
Maximum 79332.33
Range 65335.83
IQR 15524.69

Descriptive Statistics

Mean 53927.7264
Standard Deviation 11413.63
Variance 1.3027e+08
Sum 2.6964e+09
Skewness -0.4714
Kurtosis -0.2254
Coefficient of Variation 0.2116
  • Area_Income is not normally distributed (p-value 2.7404378845318142e-08)
  • Area_Income has 181 outliers

Daily_Internet_Usage

numerical

Approximate Distinct Count 661
Approximate Unique (%) 1.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 800000
Mean 174.7766
Minimum 104.78
Maximum 269.96
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Daily_Internet_Usage is skewed right (γ1 = 0.1146)

Quantile Statistics

Minimum 104.78
5-th Percentile 113.7
Q1 136.18
Median 167.86
Q3 213.75
95-th Percentile 244.23
Maximum 269.96
Range 165.18
IQR 77.57

Descriptive Statistics

Mean 174.7766
Standard Deviation 42.0239
Variance 1766.0116
Sum 8.7388e+06
Skewness 0.1146
Kurtosis -1.297
Coefficient of Variation 0.2404

Ad_Topic_Line

categorical

Approximate Distinct Count 677
Approximate Unique (%) 1.4%
Missing 0
Missing (%) 0.0%
Memory Size 4957129

Length

Mean 34.1426
Standard Deviation 5.2372
Median 34
Minimum 17
Maximum 54

Sample

1st row Front-line even-ke...
2nd row Front-line fresh-t...
3rd row Enhanced maximized...
4th row Total zero adminis...
5th row Devolved regional ...

Letter

Count 1544530
Lowercase Letter 1490805
Space Separator 109871
Uppercase Letter 53725
Dash Punctuation 46761
Decimal Number 5443
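
The row labels in this table match the names of Unicode general categories (Ll, Lu, Zs, Pd, Nd), and the Count row equals the lowercase and uppercase totals added together (1490805 + 53725 = 1544530). The sketch below shows how such per-category counts can be produced with Python's standard unicodedata module. It is an illustration under that reading, not the report generator's own code, and the sample strings are taken from the City and date samples elsewhere in this report.

```python
import unicodedata
from collections import Counter

# Unicode general-category codes mapped to the labels used in the report.
LABELS = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
}

def char_categories(text):
    """Count the characters of text that fall in each labelled category."""
    counts = Counter(unicodedata.category(ch) for ch in text)
    return {label: counts.get(code, 0) for code, label in LABELS.items()}

print(char_categories("West Jeremyside"))
print(char_categories("2016-04-04"))
```

Characters outside these five categories, such as the colon in a Time value, are not counted in any row.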

City

categorical

Approximate Distinct Count 638
Approximate Unique (%) 1.3%
Missing 0
Missing (%) 0.0%
Memory Size 3844955

Length

Mean 11.8991
Standard Deviation 2.4447
Median 12
Minimum 7
Maximum 23

Sample

1st row Silvaton
2nd row West Jeremyside
3rd row Lake Vanessa
4th row Port Sherrystad
5th row South Patrickfort

Letter

Count 573175
Lowercase Letter 501395
Space Separator 21780
Uppercase Letter 71780
Dash Punctuation 0
Decimal Number 0

Gender

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 3501972

Length

Mean 5.0394
Standard Deviation 0.9992
Median 6
Minimum 4
Maximum 6

Sample

1st row Male
2nd row Male
3rd row Male
4th row Male
5th row Female

Letter

Count 251972
Lowercase Letter 201972
Space Separator 0
Uppercase Letter 50000
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Female, Male) take over 50.0%
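
The letter count and the row total are enough to back out the Male/Female split. This assumes the column holds only the two values "Male" (4 letters) and "Female" (6 letters), which is consistent with the distinct count of 2 and the minimum and maximum lengths above but is not stated outright in the report.

```python
# Figures copied from the Gender tables.
n_rows = 50000
total_letters = 251972
total_lowercase = 201972
len_male, len_female = len("Male"), len("Female")

# Solve: male + female = n_rows and 4*male + 6*female = total_letters.
female = (total_letters - len_male * n_rows) // (len_female - len_male)
male = n_rows - female
print(male, female)  # 24014 25986

# Cross-check against the lowercase count: 3 per "Male", 5 per "Female".
print(3 * male + 5 * female == total_lowercase)

# Mean string length implied by the letter count (reported as 5.0394).
print(total_letters / n_rows)
```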

Country

categorical

Approximate Distinct Count 223
Approximate Unique (%) 0.4%
Missing 0
Missing (%) 0.0%
Memory Size 3719408

Length

Mean 9.3882
Standard Deviation 5.5444
Median 7
Minimum 4
Maximum 44

Sample

1st row Peru
2nd row Papua New Guinea
3rd row Chile
4th row French Polynesia
5th row Bosnia and Herzego...

Letter

Count 446638
Lowercase Letter 377926
Space Separator 21460
Uppercase Letter 68712
Dash Punctuation 297
Decimal Number 58

Clicked_on_Ad

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 3300000

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 0
3rd row 1
4th row 1
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 50000
  • The top 2 categories (0, 1) take over 50.0%
  • Clicked_on_Ad has words of constant length

date

categorical

Approximate Distinct Count 198
Approximate Unique (%) 0.4%
Missing 0
Missing (%) 0.0%
Memory Size 3750000

Length

Mean 10
Standard Deviation 0
Median 10
Minimum 10
Maximum 10

Sample

1st row 2016-04-04
2nd row 2016-06-18
3rd row 2016-06-26
4th row 2016-04-18
5th row 2016-07-18

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 100000
Decimal Number 400000
  • date has words of constant length

Time

categorical

Approximate Distinct Count 575
Approximate Unique (%) 1.1%
Missing 0
Missing (%) 0.0%
Memory Size 3500000

Length

Mean 5
Standard Deviation 0
Median 5
Minimum 5
Maximum 5

Sample

1st row 03:57
2nd row 16:02
3rd row 07:01
4th row 21:07
5th row 18:33

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 200000
  • Time has words of constant length
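
Both date and Time are profiled as categorical strings of constant length, which is why they are flagged for high cardinality. A minimal sketch of combining them into a single timestamp follows. The format strings "%Y-%m-%d" and "%H:%M" are inferred from the five sample rows and may not hold for every row; parse_timestamp is a hypothetical helper.

```python
from datetime import datetime

def parse_timestamp(date_str, time_str):
    """Combine a date string and a time string into one datetime (assumed formats)."""
    return datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M")

# First sample row of the date and Time columns.
ts = parse_timestamp("2016-04-04", "03:57")
print(ts.isoformat())  # 2016-04-04T03:57:00
print(ts.strftime("%A"), ts.hour)
```

Once parsed, fields such as hour of day or day of week can be analyzed as numeric or low-cardinality features instead of hundreds of distinct strings.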

Interactions

Correlations

Missing Values